Live freelance tracking. Raw descriptions turned into structured data. Find your next tech project without the noise.
upwork.com 🟢 2026-05-02
🔹 Extract Revenue and Operating Profit from Income Statement Image in PDF
👤 Client: 🇨🇦 Canada Member since 2013-12-23
💰 Price: $50
🚩 Problem: Automate extraction of financial metrics (Revenue and Operating Profit) from an image within a PDF document and store them in a SQLite database.
📦 Existing: [Not specified]
Specifications:
[Stack] PyMuPDF, pytesseract, pandas, sqlite3
[Method] Optical Character Recognition (OCR) to extract text from the image, then parse and store relevant financial metrics.
[Format] SQLite database with fields: company_name, year, metric, source_pdf
Workflow:
Install required libraries: PyMuPDF, pytesseract, pandas, sqlite3
Read the provided sample PDF using PyMuPDF to extract pages containing income statements
Apply OCR on each relevant page using pytesseract to convert images into text
Parse extracted text to identify and extract Revenue and Operating Profit for 2024
Store extracted data (company name, year, metric, source_pdf) in a new SQLite database
Test the script with another PDF containing different company's income statement